CigTime: Corrective Instruction Generation Through Inverse Motion Editing
Recent advancements in models linking natural language with human motion have shown significant promise for motion generation and editing based on instructional text. Motivated by applications in sports coaching and motor skill learning, we investigate the inverse problem: generating corrective instructional text from motion, leveraging motion editing and generation models. We introduce a novel approach that, given a user's current motion (source) and the desired motion (target), generates text instructions to guide the user toward achieving the target motion. We leverage large language models to generate corrective texts and utilize existing motion generation and editing frameworks to compile datasets of triplets (source motion, target motion, corrective text). Using this data, we propose a new motion-language model for generating corrective instructions. We present both qualitative and quantitative results across a diverse range of applications, showing substantial improvements over baselines, and demonstrate the effectiveness of our approach in instructional scenarios, offering text-based guidance to correct and enhance user performance.
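Below is a minimal sketch of how the (source motion, target motion, corrective text) triplets described in the abstract might be represented and compiled. It is an illustration under assumptions, not the paper's actual pipeline: the names `edit_motion`, `rephrase_as_correction`, the array shapes, and the direction of the edit (original motion as source, edited motion as target) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple
import numpy as np


@dataclass
class CorrectionTriplet:
    """One training example: a user's current motion, the desired motion,
    and the text instruction that corrects the former into the latter."""
    source_motion: np.ndarray  # e.g. (T, J, 3) joint positions over T frames, J joints (assumed layout)
    target_motion: np.ndarray  # same layout as source_motion
    instruction: str           # corrective text, e.g. "raise your left arm higher"


def compile_triplets(
    motions: List[np.ndarray],
    edit_motion: Callable[[np.ndarray], Tuple[np.ndarray, str]],   # hypothetical: existing motion-editing model
    rephrase_as_correction: Callable[[str], str],                  # hypothetical: LLM call that rewords an edit as coaching advice
) -> List[CorrectionTriplet]:
    """Compile (source, target, instruction) triplets from a motion corpus.

    One plausible arrangement: apply a text-conditioned motion editor to each
    motion to obtain an edited counterpart, then ask an LLM to rephrase the
    edit description as a corrective instruction.
    """
    triplets = []
    for motion in motions:
        edited, edit_text = edit_motion(motion)          # edited motion plus a description of the edit
        instruction = rephrase_as_correction(edit_text)  # e.g. "bend your knees more as you land"
        triplets.append(
            CorrectionTriplet(
                source_motion=motion,
                target_motion=edited,
                instruction=instruction,
            )
        )
    return triplets
```

Such triplets could then serve as supervision for a motion-language model that maps a (source, target) motion pair to corrective text; the exact model architecture and training objective are those of the paper and are not reproduced here.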